-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Response Parsing Protocol Tests #3247
base: develop
Are you sure you want to change the base?
Update Response Parsing Protocol Tests #3247
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## develop #3247 +/- ##
==========================================
Coverage ? 93.05%
==========================================
Files ? 66
Lines ? 14507
Branches ? 0
==========================================
Hits ? 13500
Misses ? 1007
Partials ? 0 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handful of questions about the current implementation.
cleaned_value = { | ||
k: v for k, v in cleaned_value.items() if v is not None | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand how we're getting to a state where we have None
values arriving here. Do we have an example of what that case might look like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rust SDK has experienced issues (aws-sdk-rust#1095) when parsing responses from services like quicksight who populate additional union members with null values. Although the quicksight case doesn't specifically affect boto3 due to differences in modeled shape between Smithy and C2J, it’s possible for any other service to send us this type of response for a union structure.
Related Protocol Test Id: AwsJson10DeserializeAllowNulls
botocore/parsers.py
Outdated
@@ -577,7 +590,7 @@ def _do_parse(self, response, shape): | |||
return self._parse_body_as_xml(response, shape, inject_metadata=True) | |||
|
|||
def _parse_body_as_xml(self, response, shape, inject_metadata=True): | |||
xml_contents = response['body'] | |||
xml_contents = response['body'] or b'<xml/>' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there something unique about this case or should this be handled farther down in _parse_xml_string_to_dom
? I'm curious what valid cases we're expecting the service to send us a body and we're getting nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there something unique about this case or should this be handled farther down in _parse_xml_string_to_dom?
The RestXMLParser
handles this similarly by checking for an empty string before passing it to _parse_xml_string_to_dom
:
def _initial_body_parse(self, xml_string):
if not xml_string:
return ETree.Element('')
return self._parse_xml_string_to_dom(xml_string)
It would make sense to move this logic into _parse_xml_string_to_dom
to handle both the restxml and query cases.
However, you bring up a good question:
I'm curious what valid cases we're expecting the service to send us a body and we're getting nothing.
The case the protocol tests are handling is if a service doesn't define output members, it expects nothing in the body. REST-XML services send us an empty body, however, query services send us something similar to the following as opposed to an empty body:
<?xml version="1.0"?>
<DeleteDomainResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/">
<ResponseMetadata>
<RequestId>c192b7ed-2f6f-ad6b-1a38-da6dce8d7bdb</RequestId>
<BoxUsage>0.0055590278</BoxUsage>
</ResponseMetadata>
</DeleteDomainResponse>
Conclusion
For now, I think I can move this logic into the test runner instead of our parser until we have known cases where this happens or get more guidance from smithy if this case should actually be handled.
botocore/parsers.py
Outdated
if '#' in code: | ||
code = code.rsplit('#', 1)[1] | ||
code = code.split('#', 1)[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is semantically different. Are we enforcing that the code is everything following the first #
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a : character is present, then take only the contents before the first : character in the value.
If a # character is present, then take only the contents after the first # character in the value.
I changed this to try and follow the smithy guidance as close as possible. I'm not aware of any cases where there are multiple #
in the protocol tests or services. However, to prevent any unwanted changes in behavior, I'll revert this back to our existing behavior.
if ':' in code: | ||
code = code.split(':', 1)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are now 3 distinct if
statements here that aren't using elif/else
. Is the intent to continually mutate the code
value as it goes through each check?
e.g.
com.test.mypackage#this:that#1
-- split(':', 1)
--> com.test.mypackage#this
com.test.mypackage#
-- split('#', 1)
--> this
Then this
is evaluated for the x-amzn-query-error
code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup that's correct. Smithy provides the following examples:
All of the following error values resolve to
FooError
:
FooError
FooError:http://internal.amazon.com/coral/com.amazon.coral.validate/
aws.protocoltests.restjson#FooError
aws.protocoltests.restjson#FooError:http://internal.amazon.com/coral/com.amazon.coral.validate/
botocore/parsers.py
Outdated
if code is not None: | ||
if ':' in code: | ||
code = code.split(':', 1)[0] | ||
if '#' in code: | ||
code = code.split('#', 1)[1] | ||
error['Error']['Code'] = code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we're changing behavior again here. Is there additional scope, or why are we post-processing the body for the code
case now?
tests/unit/test_protocols.py
Outdated
PROTOCOL_TEST_IGNORE_LIST_PATH = os.path.join(TEST_DIR, IGNORE_LIST_FILENAME) | ||
with open(PROTOCOL_TEST_IGNORE_LIST_PATH) as f: | ||
PROTOCOL_TEST_IGNORE_LIST = json.load(f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we use PROTOCOL_TEST_IGNORE_LIST_PATH
somewhere else? Should this whole thing be a self-contained function or pytest
fixture instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add this to a get_protocol_test_ignore_list
function as shown below:
def get_protocol_test_ignore_list():
ignore_list_path = os.path.join(TEST_DIR, IGNORE_LIST_FILENAME)
with open(ignore_list_path) as f:
return json.load(f)
tests/unit/test_protocols.py
Outdated
# If a test case doesn't define a response body, set it to `None`. | ||
if 'body' in case['response']: | ||
body_bytes = case['response']['body'].encode('utf-8') | ||
case['response']['body'] = body_bytes | ||
else: | ||
case['response']['body'] = b'' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are we setting the body to None
? It looks like our fallback is empty bytes. Is this what's causing issues with the xml test above?
@@ -462,7 +503,7 @@ def _get_suite_test_id(): | |||
if len(split) == 2: | |||
suite_id, test_id = int(split[0]), int(split[1]) | |||
else: | |||
suite_id = int(split([0])) | |||
suite_id = int(split[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code even reachable? How did this work before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the top of this file, we provide guidance for running a specific test suite:
To run a single test suite you can set the BOTOCORE_TEST_ID env var:
BOTOCORE_TEST=tests/unit/protocols/input/json.json BOTOCORE_TEST_ID=5
pytest tests/unit/test_protocols.py
However, if you follow this example you'll receive the following error:
TypeError: Invalid format for BOTOCORE_TEST_ID, should be suite_id[:test_id], and both values should be integers.
The else condition here is intended to handle this case by but instead fails due to the incorrect syntax.
* Update ignore list * Resolve some parser issues
57d7547
to
5d9bb32
Compare
cleaned_value = { | ||
k: v for k, v in cleaned_value.items() if v is not None | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rust SDK has experienced issues (aws-sdk-rust#1095) when parsing responses from services like quicksight who populate additional union members with null values. Although the quicksight case doesn't specifically affect boto3 due to differences in modeled shape between Smithy and C2J, it’s possible for any other service to send us this type of response for a union structure.
Related Protocol Test Id: AwsJson10DeserializeAllowNulls
serialized_member_names = [ | ||
shape.members[member].serialization.get('name', member) | ||
for member in shape.members | ||
] | ||
if tag not in serialized_member_names: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When parsing union structure members, botocore will use the raw member name from the HTTP response to determine if the member is know or not. If unknown, the union is populated with the following member:
“SDK_UNKNOWN_MEMBER”: {
“name”: “<unknown member name>”
}
Botocore fails to consider union members modeled with a locationName
which is used to provide an alias for a member’s name.
Solution: When determining if a union member is know, we should consider serialized names for members modeled with the locationName
trait.
Realted Protocol Test Ids: PostUnionWithJsonNameResponse1
, PostUnionWithJsonNameResponse2
These will fail until #boto#3247 is merged and this is properly rebased. This is intentional - because there's this parallel work going on, this is an easy way to make sure we don't forget the step of merging the protocol tests
Summary
This PR replaces the existing response parsing protocol tests in
botocore/tests/unit/protocols/output
with new tests generated from Smithy protocol test models. We use a version of these Smithy tests that are converted to the format currently supported by our existing test runner (test_protocols.py
). This PR makes updates to the botocore Parsers and protocol test runner to comply with a majority of the new test cases.Granular Protocol Ignore List
This PR introduces a
protocol-tests-ignore-list.json
file that can be used to ignore specific protocol test suites or cases.The structure for this list is defined below:
<test_type>
- There are two types of protocol tests.input
- Request serialization protocol test.output
- Response parsing protocol test.output
will work until the request serialization tests are updated.suites
- A list of test suite descriptions to ignore. When specified, all test cases for the related suite will be ignored.cases
- A list of test case ids to ignore.<protocol_name>
- The protocol name as represented by its corresponding protocol test file name (without the.json
) extension.ec2
,json
,json_1_0
,query
,rest-json
,rest-xml
. This may grow as we add more protocols.